The two-dimensional discrete wavelet transform has a large number of applications in image processing. Several papers have compared the performance of this transform on graphics processing units (GPUs). However, all of them dealt only with the lifting and convolution computation schemes. In this paper, we show that the corresponding horizontal and vertical parts of the lifting scheme can be merged into non-separable lifting units, which halves the number of steps. We also discuss an optimization strategy that reduces the number of arithmetic operations. The schemes were assessed using OpenCL and pixel shaders. The proposed non-separable lifting scheme outperforms the existing schemes in many cases, despite its higher complexity.
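The merging of horizontal and vertical lifting parts can be illustrated with a minimal NumPy sketch. It is not taken from the paper: it assumes a single CDF 5/3-style predict step, periodic boundary handling, and hypothetical helper names (`p_h`, `p_v`, `split`, `separable_predict`, `nonseparable_predict`). The separable version runs a horizontal step followed by a vertical step, while the non-separable version computes the same result in one step that reads only the original polyphase components, at the cost of the extra term `p_v(p_h(a))`.

```python
import numpy as np

def p_h(x):
    # Horizontal predict filter (assumed CDF 5/3, periodic boundary):
    # average of an even sample and its right-hand even neighbour.
    return 0.5 * (x + np.roll(x, -1, axis=1))

def p_v(x):
    # Vertical counterpart of p_h: average with the even neighbour below.
    return 0.5 * (x + np.roll(x, -1, axis=0))

def split(img):
    # Polyphase components: a = (even row, even col), b = (even row, odd col),
    # c = (odd row, even col), d = (odd row, odd col).
    return (img[0::2, 0::2].copy(), img[0::2, 1::2].copy(),
            img[1::2, 0::2].copy(), img[1::2, 1::2].copy())

def separable_predict(img):
    a, b, c, d = split(img)
    # Step 1: horizontal predict on every row.
    b = b - p_h(a)
    d = d - p_h(c)
    # Step 2: vertical predict on every column (uses the updated b).
    c = c - p_v(a)
    d = d - p_v(b)
    return a, b, c, d

def nonseparable_predict(img):
    a, b, c, d = split(img)
    # Single merged step: every right-hand side reads original values only.
    b_new = b - p_h(a)
    c_new = c - p_v(a)
    d_new = d - p_h(c) - p_v(b) + p_v(p_h(a))
    return a, b_new, c_new, d_new

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.standard_normal((8, 8))
    same = all(np.allclose(s, n) for s, n in
               zip(separable_predict(img), nonseparable_predict(img)))
    print("separable and non-separable results agree:", same)
```

In this sketch the merged step needs no intermediate result from a previous step, so two dependent steps collapse into one. The additional term applied to `a` shows where the higher arithmetic complexity mentioned above comes from.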